Abstract A key goal of systems biology is the predictive mathematical description
of gene regulatory circuits. Different approaches are used such as
deterministic and stochastic models, models that describe cell growth and 
division explicitly or implicitly etc. Here we consider simple systems of un- 
regulated (constitutive) gene expression and compare different mathematical 
descriptions systematically to obtain insight into the errors that are intro- 
duced by various common approximations such as describing cell growth and 
division by an effective protein degradation term. In particular, we show 
that the population average of protein content of a cell exhibits a subtle 
dependence on the dynamics of growth and division, the specific model for 
volume growth and the age structure of the population. Nevertheless, the 
error made by models with implicit cell growth and division is quite small. 
Furthermore, we compare various models that are partially stochastic to in- 
vestigate the impact of different sources of (intrinsic) noise. This comparison 
indicates that different sources of noise (protein synthesis, partitioning in cell 
division) contribute comparable amounts of noise if protein synthesis is not 
or only weakly bursty. If protein synthesis is very bursty, the burstiness is the 
dominant noise source, independent of other details of the model. Finally, we 
discuss two sources of extrinsic noise: cell-to-cell variations in protein content 
due to cells being at different stages in the division cycles, which we show to 
be small (for the protein concentration and, surprisingly, also for the protein 
copy number per cell) and fluctuations in the growth rate, which can have a
significant impact. 

Keywords Genetic circuits • stochastic gene expression • noise • cell
division • growth rate



1 Introduction 

With the emergence of systems biology and synthetic biology, concepts and 
methods from mathematics, physics and engineering are increasingly used 
in the life sciences [1,14,2]. In particular, two central goals of this field are
to predict the dynamics of gene expression based on mathematical descrip- 
tions of the genetic networks of a cell and to design genetic circuitry based 
on well-characterized regulatory elements [21,6,18,47]. The progress of this
research program has however also highlighted a number of generic compli- 
cations that arise from the fact that all genetic circuits function in a cellular 
chassis that itself is dynamic and adapts to external conditions, which can 
have unexpected effects on circuit function [25,43,40,27,29]. This observation
raises the question of which mathematical description is appropriate for
genetic circuitry in a dynamic cell. In particular, even if
the external conditions are constant and the cells exhibit 'balanced growth' 
(a steady state of all cellular parameters except for the overall exponential 
growth of the culture), each individual cell grows and divides and, while 
doing so, doubles its content of all cellular components. Some of the compo- 
nents will clearly affect the function of any gene circuit, the most important 
example being the duplication of the circuit genes themselves. In mathemat- 
ical models of genetic circuits, these effects are often ignored and described 
by an average gene copy number and an effective degradation of the protein 
that mimics the dilution of a protein concentration due to cell growth in the 
absence of its synthesis. In this article, we therefore ask how strongly cell 
growth within the division cycle affects gene expression and whether mod- 
els that do not describe growth and division explicitly introduce big errors 
through that approximation. 

Another facet of the question which mathematical description to use is the 
question whether such a description should be deterministic or stochastic. It 
has been realized in recent years that often the relevant molecules are present 
in the cell in low copy numbers, giving rise to large fluctuations and thus 
requiring a stochastic description of gene expression [3T)llTMl"5ll37ll2"2"] . The 
foundations for this view were laid long ago [41,3], but the progress in
single-molecule and single-cell technology now allows the direct observation 
of these effects and the quantification of fluctuations from time series or from 
cell-to-cell variability j4^ |[4^1IT71[T5] . Stochasticity in gene expression has been 
studied extensively from a theoretical point of view, see e.g. [3,28,36,10,4, 
1201145 , 42 , 35 , 39] . Here we ask about the sources of stochasticity, as noise can 
be generated at many points in the process of protein synthesis and by the 
partitioning during cell division. Many of the noise sources have been studied 
before, but we are interested in a systematic comparison of their impact. 
Specifically, we ask whether there is a dominant source of noise, and whether 
the noise predicted from models with explicit cell growth and division differs
from what is obtained from implicit cell division models. 

It turns out that the question of stochasticity and the dependence of 
gene expression on the growth and division cycle are closely related: The 
variation of a protein concentration during the division cycle is observed 
as a cell-to-cell variation in that concentration in snapshots of cell cultures 
(where the division cycles of different cells are typically not synchronized, 
i.e. different cells divide at different times). We therefore also determine the 
effective 'noise' that arises from the dependence on the division cycle (which 
in fact is a deterministic component of the observed 'noise' and is seen as 
part of the so-called 'extrinsic noise' that is common to different genes [T5J 

W)- 

The paper is organized as follows: We start with deterministic descriptions
of gene expression in section 2, where we discuss the effects of the
division cycle and approximations that 'average out' the division cycle. In 
sections [3] and [4] we discuss several simple models that describe various pro- 
cesses of gene expression stochastically to address the question of the relative 
importance of various sources of stochasticity. We derive analytical results 
for some key characteristics of the noise. Here we focus on intrinsic noise, 
i.e. noise inherent in the synthesis and division process and specific to one 
gene. Extrinsic noise is discussed in section [5] where we come back to the 
dependence of protein concentrations on the division cycle and show that the
effective 'noise' resulting from this dependence is small (section 5.1). In addition,
we also include a discussion of fluctuations of the growth rate (section 5.2).
We end with some general conclusions in section 6, where we summarize
the relative importance of various sources of noise and cell-to-cell variations
and discuss the minimal ingredients to arrive at realistic descriptions of gene
expression.



2 Deterministic descriptions of gene expression 

2.1 Basic model 



We will start by discussing a simple deterministic model of protein synthesis 
that accounts for the effects of the cell division cycle, specifically cell division 
itself and gene duplication, on protein synthesis. Living cells grow and
divide, while in the meantime, proteins are continuously synthesized inside 
the cell. We determine the amount of protein synthesized within a cell cycle 
and the corresponding concentrations for both exponential and linear cell 
growth. 

The number of copies of a specific protein in a cell, P(t), is described by 
the following dynamics: 

\[ \dot{P} = \alpha g - \beta P, \tag{1} \]

where $\alpha$ is the protein synthesis rate, $g$ is the gene copy number and $\beta$ is
the protein degradation rate (typical parameter values are summarized in
appendix A.1). Throughout this work, we will assume that the proteins are
stable ($\beta \approx 0$), as is typically the case for bacterial proteins [32].

Fig. 1 Variation of the protein number $P(t)$ (a) and concentrations $p_{\rm lin}(t)$ and
$p_{\rm exp}(t)$ (b) over the cell division cycle. (a) The protein copy number increases from
$P_0 = 7500$ to $2P_0 = 15000$ during a cell division cycle. Note that the protein synthesis
rate doubles at time $t_x$ after each cell division, when the gene is replicated.
(b) The corresponding protein concentration decreases transiently during the division
cycle. This effect is more pronounced for linear volume growth (solid blue
line) than for exponential volume growth (green dashed line) within the division
cycle. The parameters are $\alpha = 5000/T$, $T = 60$ min, $t_x = 30$ min, $V_0 = 0.5\,\mu{\rm m}^3$.

While the proteins are synthesized, the cell also grows and divides. Divi- 
sions take place at integer multiples of the doubling time T. Here we treat cell 
division as a deterministic process that occurs instantaneously. At the time 
of division, the amount of our protein of interest is divided equally among 
two daughter cells, so that its amount per cell is simply divided by 2. The 
same partitioning applies to all other contents of the cell, and therefore, in 
a steady state of growth, all content of the cell has to be doubled between 
divisions. Specifically, we are interested in the doubling of the gene that en- 
codes our protein of interest. This gene, which we assume to be present as a 
single copy in the genome of the cell, is doubled at a time $t_x$ after the last
division (and, of course, divided by 2 at the time of division). Therefore, the
gene copy number $g$ that enters Eq. (1) is given by $g = 1$ for times $0 < t < t_x$
after division and by $g = 2$ for times $t_x < t < T$. Another important characteristic
of the cells that has to double over the doubling time is the total
cell mass or the cell volume. We will come back to this point below, when we 
discuss the concentration of the protein. 

Now we consider our gene to be in a 'steady state' of protein synthesis, 
in the sense that the protein level only depends on the time in the division 
cycle, but is the same if corresponding time points in different cycles are 
compared. In that case, the protein copy number at the end of the cycle is 
exactly twice that at its beginning, i.e. $P(t = T) = 2P_0 = 2P(t = 0)$ (here
and in the following, we measure time with respect to the time of division, i.e.
assume that divisions take place at integer multiples of $T$). This condition,
which can be considered as a singular boundary condition for Eq. (1) with
times restricted to the interval [0, T], determines the time course of the copy 
number of our protein of interest per cell, 

\[ P(t) = \begin{cases} \alpha\,(t + 2T - t_x) & \text{for } 0 < t < t_x \\ 2\alpha\,(t + T - t_x) & \text{for } t_x < t < T. \end{cases} \tag{2} \]

Immediately after division, there are $P_0 = \alpha(2T - t_x)$ copies of the protein
in the cell, and the same number is synthesized over the doubling time $T$
(Figure 1). This synthesis occurs in two phases, from one or two copies of
the gene, respectively. One can define an effective synthesis rate
$\alpha_{\rm eff} = \alpha(2 - t_x/T)$; the number of proteins synthesized over the division cycle
then has the intuitive form $\alpha_{\rm eff}T$.
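
The piecewise solution can be checked numerically. The following minimal Python sketch evaluates Eq. (2) with the parameter values quoted in the caption of Fig. 1 and confirms that the copy number doubles over one cycle and that the effective synthesis rate times the doubling time equals the number of proteins made per cycle. The function name `protein_number` is an illustrative choice and does not come from the original work.

```python
def protein_number(t, alpha, T, t_x):
    """Protein copy number P(t) within one division cycle (Eq. 2), for 0 <= t <= T."""
    if not 0.0 <= t <= T:
        raise ValueError("t must lie within one division cycle [0, T]")
    if t < t_x:
        return alpha * (t + 2.0 * T - t_x)  # one gene copy: slope alpha
    return 2.0 * alpha * (t + T - t_x)      # two gene copies: slope 2 * alpha


# Parameter values quoted in the caption of Fig. 1
T = 60.0            # doubling time in minutes
t_x = 30.0          # gene replication time in minutes
alpha = 5000.0 / T  # synthesis rate per gene copy, proteins per minute

P_start = protein_number(0.0, alpha, T, t_x)
P_end = protein_number(T, alpha, T, t_x)
alpha_eff = alpha * (2.0 - t_x / T)

print(f"P(0) = {P_start:.1f}, P(T) = {P_end:.1f}")
print(f"synthesized per cycle = {P_end - P_start:.1f}, alpha_eff * T = {alpha_eff * T:.1f}")
```

With these values the copy number after division should come out close to the 7500 copies stated in the figure caption.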

We now turn to the corresponding concentration of the protein. This will 
be denoted by $p$ and is given by $p = P/V$, the number of protein molecules
per cell divided by the cell volume V. It therefore also depends on the time 
course of the cell volume over the division cycle. The functional form of that 
time dependence has been debated for a long time, see for example a recent 
discussion in ref. [13]. Here we use two models that have been proposed,
namely linear and exponential growth of the cell volume, which we indicate 
by the subscripts 'lin' and 'exp', respectively. 

We denote the cell volume at the beginning of a cycle by $V(t = 0) = V_0$.
In a steady state of growth, this volume must have doubled at the end of the
cell cycle, such that $V(t = T) = 2V_0$. Using this constraint, the cell volume
$V(t)$ is given by

\[ V_{\rm lin}(t) = V_0\,(1 + t/T) \tag{3} \]

for linear and by

\[ V_{\rm exp}(t) = V_0\, 2^{t/T} \tag{4} \]

for exponential growth. As a consequence, the concentration of our protein
at the beginning and at the end of a division cycle is equal,
$p(t = 0) = p(t = T) = P_0/V_0$. However, it decreases between divisions as the protein copy
number initially grows more slowly than the volume. When the gene is du- 
plicated, the protein copy number growth speeds up and becomes faster than 
volume growth and the concentration increases for times t x < t < T such 
that the concentration returns to its initial value. This temporary decrease 
of the protein concentration is more pronounced for linear than for exponential
volume growth, as can be seen in Fig. 1(b). The extent of this decrease
depends on the timing of gene duplication (which is dependent on the po- 
sition of the gene with respect to the origin of DNA replication p"2ll7] ) . For 
example, in the extreme case, where the gene is duplicated immediately after 
or before cell division, the protein content increases approximately linearly, 
and thus, for linear volume growth, the concentration is almost constant over 
the division cycle. We will come back to this point in section 5.1, when we
discuss the contribution of division cycle effects to the observed 'noise' in the 
protein content. 
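
The transient decrease of the concentration can be illustrated with a short calculation. The sketch below assumes the piecewise-linear copy number of Eq. (2), a linearly growing volume and a volume that doubles exponentially over the time T, and uses the parameter values from the caption of Fig. 1. All function names are illustrative.

```python
def protein_number(t, alpha, T, t_x):
    """Protein copy number within one division cycle (Eq. 2)."""
    if t < t_x:
        return alpha * (t + 2.0 * T - t_x)
    return 2.0 * alpha * (t + T - t_x)


def volume_linear(t, V0, T):
    """Linear volume growth from V0 to 2 * V0 over one cycle."""
    return V0 * (1.0 + t / T)


def volume_exponential(t, V0, T):
    """Exponential volume growth with doubling time T."""
    return V0 * 2.0 ** (t / T)


T, t_x, V0 = 60.0, 30.0, 0.5   # minutes, minutes, cubic micrometres
alpha = 5000.0 / T
times = [T * i / 600 for i in range(601)]  # 0.1-minute grid over one cycle

p_lin = [protein_number(t, alpha, T, t_x) / volume_linear(t, V0, T) for t in times]
p_exp = [protein_number(t, alpha, T, t_x) / volume_exponential(t, V0, T) for t in times]

# Largest transient decrease relative to the concentration right after division
dip_lin = 1.0 - min(p_lin) / p_lin[0]
dip_exp = 1.0 - min(p_exp) / p_exp[0]
print(f"maximal dip: linear {dip_lin:.1%}, exponential {dip_exp:.1%}")
```

For these parameters the dip is larger for linear than for exponential volume growth, in line with the statement above.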



2.2 Population averages 

The dynamics described so far is observable in experiments that track the 
content of specific proteins in single cells. Such experiments have been done 
(e.g., 49,44, fl ), although most of these studies were more focused on stochas- 
tic effects. In many experiments, however, what is observed is the population 
average of the protein content per cell. Unless the cell culture is specifically 
prepared to synchronize the division cycles of these cells, the population will 
consist of many cells ($\sim 10^9$ in a typical bacterial culture) that divide in an
asynchronous fashion. Averages of cellular properties over such populations 
will in general not only depend on the dynamics of the observable over the 
division cycle, but also on the age distribution in the population, i.e. the 
distribution of the time points in the division cycle at which these cells are. 
The latter depends on the experimental setup. We consider two cases, an exponential
and a constant age distribution. The exponential age distribution,

\[ \phi(t) = \frac{2\ln 2}{T} \exp\left(-\frac{\ln 2\; t}{T}\right), \tag{5} \]

applies to asynchronous cultures with an exponentially growing population
size, where there are more young cells than old cells. The average age of a
cell in such a culture is $\langle t\rangle = \int_0^T t\,\phi(t)\,dt = T(1/\ln 2 - 1) \approx 0.44\,T$.

In addition we consider a constant age distribution, $\phi(t) = 1/T$, which
is obtained if, for example, after each cell division only one of the daughter
cells is kept and analyzed. An example of such an experimental setup is the
'mother machine' that was described recently [48].

The protein copy number per cell, averaged over an exponentially
growing population, is given by

\[ \langle P\rangle = \int_0^T P(t)\,\phi(t)\,dt = \frac{\alpha T}{\ln 2}\, 2^{1 - t_x/T}. \tag{6} \]

This result can be rewritten as $\langle P\rangle = \alpha\langle g\rangle/\beta_{\rm eff}$ with an effective degradation
rate of the protein $\beta_{\rm eff} = (\ln 2)/T$ that describes the loss of proteins due to
cell division (with half-life equal to the doubling time of the cells), and the
average copy number of the gene, $\langle g\rangle = 2^{1 - t_x/T}$. For comparison, the average
protein copy number per cell in a population with constant age distribution
is $\langle P\rangle = \alpha T[3 - 2t_x/T + \frac{1}{2}(t_x/T)^2]$. Notice that this is in general not equal to
$3P_0/2$. A numerical comparison with Eq. (6) shows that the average protein
number is approximately 4 % larger with the constant age distribution than
with the exponential age distribution.
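
Both population averages can be reproduced by direct numerical integration. The sketch below works in units where the copy number is measured in multiples of the synthesis rate times T and time in multiples of T. It assumes that the exponential age distribution has the form (2 ln 2 / T) 2^(-t/T) and compares midpoint-rule integrals with the closed-form averages. The helper names and the choice t_x = T/2 are illustrative.

```python
import math


def protein_number(u, x):
    """P(t) / (alpha * T) from Eq. (2), with u = t / T and x = t_x / T."""
    return u + 2.0 - x if u < x else 2.0 * (u + 1.0 - x)


def exponential_age(u):
    """Exponential age distribution in units of 1 / T (assumed form, see lead-in)."""
    return 2.0 * math.log(2.0) * 2.0 ** (-u)


def constant_age(u):
    """Constant age distribution in units of 1 / T."""
    return 1.0


def population_average(age_distribution, x, n=20000):
    """Midpoint-rule estimate of <P> / (alpha * T) over one division cycle."""
    h = 1.0 / n
    midpoints = ((i + 0.5) * h for i in range(n))
    return h * sum(protein_number(u, x) * age_distribution(u) for u in midpoints)


x = 0.5  # gene replicated halfway through the cycle, as in Fig. 1
avg_exponential = population_average(exponential_age, x)
avg_constant = population_average(constant_age, x)
closed_exponential = 2.0 ** (1.0 - x) / math.log(2.0)  # alpha <g> / beta_eff
closed_constant = 3.0 - 2.0 * x + 0.5 * x ** 2

print(f"exponential ages: {avg_exponential:.4f} (closed form {closed_exponential:.4f})")
print(f"constant ages:    {avg_constant:.4f} (closed form {closed_constant:.4f})")
print(f"relative difference: {avg_constant / avg_exponential - 1.0:.1%}")
```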

The average concentration can be calculated in the same way, but is more 
involved due to the age-dependence of the volume. We give only the result 
for exponential volume growth and an exponential age distribution. In this
case, we obtain

\[ \langle p\rangle = \int_0^T p(t)\,\phi(t)\,dt = \frac{\alpha T}{V_0}\, \frac{1/2 + 2^{-2t_x/T} + 2\ln 2 - \ln 2\; t_x/T}{2\ln 2}. \tag{7} \]

This can be compared to the 'mean field' result $\langle p(t)\rangle \approx \langle P\rangle/\langle V\rangle$ that is
obtained from the average protein number and the average volume. Using
$\langle V\rangle = (2\ln 2)V_0$, that approximation leads to

\[ \langle p\rangle \approx \frac{\alpha T}{V_0}\, \frac{2^{1 - t_x/T}}{2(\ln 2)^2} = \frac{\alpha T \langle g\rangle}{2(\ln 2)^2\, V_0}. \tag{8} \]

A numerical comparison with the exact result shows that they differ by less 
than 0.3 % for all values of the replication time t x . Likewise, we find that the 
average concentrations for linear and exponential volume growth also differ 
only by a few percent. 
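
The accuracy of the mean-field approximation can be checked with a few lines of code. The closed-form expressions in the sketch below are a reading of the exact and mean-field averages for exponential volume growth and an exponential age distribution, written in units of the synthesis rate times T divided by the initial volume; they should be treated as assumptions and are cross-checked against a direct midpoint-rule integration. The function names are illustrative.

```python
import math

LN2 = math.log(2.0)


def exact_average(x):
    """Exact <p> * V0 / (alpha * T) for exponential growth, with x = t_x / T."""
    return (0.5 + 2.0 ** (-2.0 * x) + 2.0 * LN2 - LN2 * x) / (2.0 * LN2)


def mean_field_average(x):
    """<P> / <V> in the same units, using <V> = 2 ln(2) V0."""
    return 2.0 ** (1.0 - x) / (2.0 * LN2 ** 2)


def quadrature_average(x, n=20000):
    """Midpoint-rule integral of p(t) * phi(t) for exponential volume growth."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        copies = u + 2.0 - x if u < x else 2.0 * (u + 1.0 - x)  # P / (alpha * T)
        total += copies / 2.0 ** u * 2.0 * LN2 * 2.0 ** (-u)     # (P / V) * phi
    return total * h


grid = [i / 10 for i in range(11)]  # replication times t_x / T from 0 to 1
max_quadrature_gap = max(abs(quadrature_average(x) - exact_average(x)) for x in grid)
max_relative_error = max(abs(mean_field_average(x) / exact_average(x) - 1.0) for x in grid)
print(f"closed form vs quadrature: {max_quadrature_gap:.2e}")
print(f"largest mean-field error:  {max_relative_error:.3%}")
```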



2.3 Averaging out the cell division cycle 

The observation that the 'mean field' approximation for the protein concentration
given in Eq. (8) is rather accurate suggests that the dynamics on time
scales that are longer than the generation time can actually be described by
the following equation

\[ \dot{p} = \frac{\alpha\langle g\rangle}{\langle V\rangle} - \beta_{\rm eff}\, p, \tag{9} \]

with $\beta_{\rm eff} = \ln 2/T$ as before (or $\beta_{\rm eff} = \beta + \ln 2/T$, if the protein is unstable).
The equation can also be interpreted as describing the dynamics of the average
concentration in a population of non-synchronized cells. Through $\beta_{\rm eff}$, the
equation describes the loss of protein due to growth and division of the cells
as an effective degradation. As protein concentration is actually diluted out
by volume growth throughout the division cycle (in contrast to the protein
number per cell, which experiences dilution through instantaneous reduction
by 50 % at division), and thus its variations through the cycle are relatively
small (Fig. 1(b)), this approximation can be expected to be quite good. The
same approximation can also be used for the average protein copy number 
per cell, but there one has to keep in mind that variations over the division 
cycle that are neglected, are stronger, as the protein number P varies 2-fold 
over the cycle. 
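
That an effective degradation rate of ln 2 / T captures the slow dynamics can be illustrated by following the approach to the steady state. The sketch below is a simplified comparison under stated assumptions: the cell starts without protein right after a division, a fixed number of proteins is added in every cycle, and division halves the copy number. The levels right after each division are compared with the solution of the model with effective degradation at the same times. The numerical value of `alpha_eff` is hypothetical.

```python
import math


def explicit_levels(n_cycles, alpha_eff, T):
    """Copy number right after each division when synthesis starts from P = 0.

    Each cycle adds alpha_eff * T proteins; division then halves the total.
    """
    P = 0.0
    levels = [P]
    for _ in range(n_cycles):
        P = 0.5 * (P + alpha_eff * T)
        levels.append(P)
    return levels


def implicit_level(t, synthesis_rate, beta_eff):
    """Solution of dP/dt = synthesis_rate - beta_eff * P with P(0) = 0."""
    return synthesis_rate / beta_eff * (1.0 - math.exp(-beta_eff * t))


T = 60.0                      # doubling time in minutes
alpha_eff = 125.0             # hypothetical effective synthesis rate, proteins per minute
beta_eff = math.log(2.0) / T  # effective degradation rate of a stable protein

# Fractions of the respective steady-state levels after n = 0..6 generations
explicit_fraction = [P / (alpha_eff * T) for P in explicit_levels(6, alpha_eff, T)]
implicit_fraction = [implicit_level(n * T, 1.0, beta_eff) * beta_eff for n in range(7)]

for n, (e, i) in enumerate(zip(explicit_fraction, implicit_fraction)):
    print(f"generation {n}: explicit {e:.4f}, implicit {i:.4f}")
```

In both descriptions the remaining distance to the steady state halves with every generation.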



2.4 A remark on messenger RNA 

Protein synthesis is a process that occurs in two steps, transcription and 
translation. In the first step, the gene sequence is copied into a mRNA, which 
subsequently serves as a template for protein synthesis. A more complete 
description of the process thus describes the copy numbers of the protein
($P$) and of the mRNA ($M$),

\[ \dot{M} = \alpha_m g - \beta_m M, \qquad \dot{P} = \alpha_p M - \beta_p P, \tag{10} \]

with $\alpha_m$, $\alpha_p$ and $\beta_m$, $\beta_p$ being the synthesis and degradation rates of mRNA
and protein, respectively. In many cases, however, mRNA is rather short-lived
and one can approximate the equation for $M$ by its steady state,
$M = \alpha_m g/\beta_m$. In that case, we are back to Eq. (1) with $\alpha = \alpha_p\alpha_m/\beta_m$.

This approximation is specifically suited for gene expression in bacteria,
where typically mRNA lifetimes are of the order of a few minutes [5,44],
while proteins, as mentioned above, are mostly stable [32,38]. This means
that when a gene is turned off and synthesis of the corresponding mRNA 
and protein is stopped, the mRNA will disappear with a half-life of a few 
minutes, while the protein is diluted out through cell growth and division 
and its half-life is given by the doubling time, which is typically of the order 
of 1 hour (the range for E. coli is 20 min - many hours). 



3 Sources of (intrinsic) stochasticity 

As mentioned in the introduction, the copy numbers of some proteins can be 
small, so that fluctuations play an important role, and stochastic descriptions 
of the dynamics of gene expression are required. In general, all steps in the 
synthesis pathway of proteins are stochastic processes. The same is true for 
the degradation of the protein if that protein is unstable. In addition, the 
partitioning of the copies of that protein during cell division also adds to 
the noise. We will now consider these different sources of noise separately 
to characterize the noise arising from different sources in a systematic way.¹
In these considerations, we aim at understanding the relative importance 
of different sources of stochasticity rather than at accurately capturing the 
complicated processes that govern protein production in precise biological 
detail. Specifically we ask which sources contribute to the noise level observed 
in the protein number and whether there is a dominant source. In this sense, 
the most realistic model is the one that includes stochastic effects in all 
processes considered here, but we are interested in whether a reduced model 
may be sufficient. 

We use a bottom-up approach to study the contributions from cell division,
protein synthesis, and finally transcription and translation. We start
with a stochastic version of the models described in section 2.1, i.e. with
models that treat protein synthesis as a simple one-step process. Effects that
are due to the two-step nature of protein synthesis (transcription and translation)
will be discussed later in section 4. The most basic model thus describes
protein synthesis and cell division, and we study three versions of this scenario.
First, we take the partitioning of proteins into daughter cells upon division to
be stochastic (section 3.1), but describe protein synthesis deterministically.
Second, we treat protein synthesis as a stochastic process but partitioning
during cell division as deterministic (section 3.2). Finally, both synthesis and
cell division are considered as stochastic processes (section 3.3). Our analysis
shows that the two noise sources contribute similarly to the overall noise, so
neither noise source is dominant.

¹ There are some sources of noise that are specific to particular situations, e.g.
to highly transcribed genes with dense traffic of RNA polymerases [24,26]. These
will not be considered here.

In section 4, we discuss models that explicitly treat protein synthesis as
occurring in two steps, transcription and translation. The resulting noise is
then characterized in terms of a parameter termed 'burst size', which characterizes
the average number of proteins synthesized per mRNA copy. Here,
high burstiness leads to a significant increase in the noise, with bursty protein
synthesis then being the dominant source of stochasticity. Thus, under
conditions of high burstiness, a reduced model that neglects other
sources of noise can provide a realistic description of the dynamics.

All the sources of stochasticity we discuss here produce so-called intrinsic
noise [15], i.e. the fluctuations are specific to the gene/protein under consideration
and the fluctuations in the levels of two different proteins are uncorrelated.
Sources of extrinsic noise, which affects all genes, will be discussed in
section 5.



3.1 Stochastic partitioning during cell division 

We first consider the case, where protein synthesis is described by a deter- 
ministic process, but where proteins are distributed stochastically into the 
daughter cells during cell division. Specifically, we consider the case, where 
each copy of the protein has probability r = 1/2 to end in each of the 
two daughter cells. This means that in every generation a constant number 
$Q = \alpha T$ of proteins is newly synthesized, but the initial copy number of the
protein at the beginning of the division cycle fluctuates due to the stochastic
partitioning during cell division. Fig. 2(a) shows a time series of such a
process as obtained from simulations.

For this case, a number of characteristics can be obtained analytically
using a method described in ref. [8], which we summarize briefly in appendix
A.2. For example, the average copy number of the protein directly after cell
division is $\langle P_0\rangle = Q = \alpha T$ and the variance of that number is $\delta P_0^2 = 2Q/3$.
Two commonly used characteristics of noise are the noise strength $\eta^2$, defined
as

\[ \eta^2 = \frac{\langle P^2\rangle - \langle P\rangle^2}{\langle P\rangle^2} = \frac{\delta P^2}{\langle P\rangle^2}, \tag{11} \]

and the Fano factor $F = \eta^2\langle P\rangle$. $\eta^2$ typically scales as $\eta^2 \sim 1/\langle P\rangle$, so the
latter parameter provides a characterization of the prefactor of that scaling.
In our specific case, we obtain

\[ \eta_0^2 = \frac{2}{3\langle P_0\rangle} \tag{12} \]

or $F_0 = 2/3$ (the index '0' in these expressions indicates that we have taken
averages over a population of cells immediately after division), plotted in
Figure 2(d).
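
The mean and variance quoted above can be made plausible with a short moment calculation. The sketch below is one possible route and not necessarily the method of the cited reference. It assumes that each of the $N = P_0 + Q$ molecules present just before division ends up in a given daughter cell independently with probability 1/2, and it uses the law of total variance. A prime denotes the copy number of the daughter cell right after division.

```latex
\begin{align}
P_0' \mid N &\sim \mathrm{Binomial}(N, 1/2), \qquad N = P_0 + Q, \\
\langle P_0' \rangle &= \tfrac{1}{2}\left(\langle P_0\rangle + Q\right)
  \;\Rightarrow\; \langle P_0\rangle = Q \quad \text{in the steady state}, \\
\delta P_0'^2 &= \left\langle \tfrac{N}{4} \right\rangle + \tfrac{1}{4}\,\delta N^2
  = \tfrac{1}{4}\left(\langle P_0\rangle + Q\right) + \tfrac{1}{4}\,\delta P_0^2, \\
\delta P_0^2 &= \tfrac{Q}{2} + \tfrac{1}{4}\,\delta P_0^2
  \;\Rightarrow\; \delta P_0^2 = \tfrac{2Q}{3}, \qquad
  \frac{\delta P_0^2}{\langle P_0\rangle} = \frac{2}{3}.
\end{align}
```

The third line uses that $Q$ is a constant, so that the variance of $N$ equals the variance of $P_0$.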



3.2 Stochastic protein synthesis 

Next we consider the stochasticity that is inherent in the protein synthesis 
process itself. To disentangle it from the effects of stochastic partitioning we 
first describe partitioning deterministically, i.e. we consider the case where
each daughter cell inherits exactly one half of the protein molecules (Figure 2).

We consider again one lineage of cells. Between two cell divisions proteins
are synthesized stochastically with rate $\alpha$. At the time of cell division (integer
multiples of the doubling time $T$), the protein number is divided by two (if
the protein number $P$ is an odd number, we take the number after division
to be either $(P + 1)/2$ or $(P - 1)/2$, each with probability 1/2, so strictly
speaking, there is a minimal remnant of stochasticity in our deterministic
description of division as well). To keep the discussion simple, we assume
here that the synthesis rate is constant, i.e. we neglect the fact that the
synthesis rate changes upon duplication of the gene. We find

\[ \langle P_0\rangle = \alpha T, \quad \delta P_0^2 = \frac{\alpha T}{3} \quad \text{and} \quad \eta_0^2 = \frac{1}{3\langle P_0\rangle}. \tag{13} \]

The last result implies that the Fano factor is $F_0 = 1/3$, which is just half of
what we have seen for stochastic partitioning (Eq. 12).



3.3 Both sources combined 

Now let us combine the two sources of stochasticity discussed so far and
consider the case where both protein synthesis and partitioning are stochastic
(Figure 2(c)). Using again the method of ref. [5], we obtain

\[ \langle P_0\rangle = \alpha T, \quad \delta P_0^2 = \alpha T, \quad \text{and} \quad \eta_0^2 = 1/\langle P_0\rangle. \tag{14} \]

Two points are noteworthy here: (i) The noise strengths ($\eta^2$) of independent
noise sources are additive. In our case, the noise in Eq. (14) is the sum of
the noise components that arise from stochastic partitioning ($2/(3\langle P_0\rangle)$) and
from stochastic synthesis ($1/(3\langle P_0\rangle)$). (ii) The contributions from both sources
of noise are of the same order; there is no dominant source of noise in this
simple case.
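
The three scenarios can be compared in a small lineage simulation. The sketch below is a simplified illustration and not the simulation code behind Fig. 2: it only tracks the copy number right after each division, draws the number of proteins synthesized per generation from a Poisson distribution (sampled with Knuth's multiplication method) when synthesis is stochastic, and uses binomial partitioning when division is stochastic. The value of `Q`, the number of generations and all function names are hypothetical choices.

```python
import math
import random


def poisson(rng, mean):
    """Poisson random number via Knuth's multiplication method (small means only)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def binomial_half(rng, n):
    """How many of n molecules end up in one daughter cell (probability 1/2 each)."""
    return sum(1 for _ in range(n) if rng.random() < 0.5)


def fano_after_division(Q, stochastic_synthesis, stochastic_partitioning,
                        generations=20000, burn_in=100, seed=0):
    """Follow one lineage; return (mean, Fano factor) of the copy number after division."""
    rng = random.Random(seed)
    P = Q
    samples = []
    for generation in range(generations + burn_in):
        P += poisson(rng, Q) if stochastic_synthesis else Q
        if stochastic_partitioning:
            P = binomial_half(rng, P)
        else:
            odd = P % 2
            P //= 2
            if odd and rng.random() < 0.5:
                P += 1  # the odd molecule goes to either daughter at random
        if generation >= burn_in:
            samples.append(P)
    mean = sum(samples) / len(samples)
    variance = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, variance / mean


Q = 20  # hypothetical mean number of proteins synthesized per generation
_, fano_partitioning = fano_after_division(Q, False, True)
_, fano_synthesis = fano_after_division(Q, True, False)
mean_both, fano_both = fano_after_division(Q, True, True)
print(f"partitioning only: F = {fano_partitioning:.2f}")
print(f"synthesis only:    F = {fano_synthesis:.2f}")
print(f"both sources:      F = {fano_both:.2f} at mean {mean_both:.1f}")
```

Up to sampling error, the Fano factors should lie near 2/3, 1/3 and 1, with the last approximately equal to the sum of the first two. The random assignment of an odd molecule adds a small extra variance in the synthesis-only case.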

Fig. 2 Stochastic models of protein synthesis: (a)-(c) Trajectories of the protein
copy number from stochastic simulations with stochastic synthesis, cell division, or
both, all with cell division modeled explicitly. (d) Noise strength $\eta^2$ as a function
of the average protein copy number $\langle P\rangle$ (varied by varying the synthesis rate $\alpha$)
for the different models (for the models with explicit cell division, averages over
cells immediately after division are plotted, i.e. $\eta_0^2$ and $\langle P_0\rangle$). (e) Trajectory of the
protein copy number for a model with implicit cell division, i.e. where cell division is
described by an effective degradation rate $\beta_{\rm eff}$. The corresponding curve in (d) lies
on top of the curve for stochastic synthesis and stochastic division. The parameter
values used for these plots are $\alpha = 0.5$/min, $T = 40$ min, and, in (e), $\beta = \ln 2/T$.

For comparison, we also consider the corresponding model with implicit
cell division, i.e. a stochastic version of Eq. (9), where the effect of protein
dilution through cell growth and division is described by an effective
degradation rate $\beta_{\rm eff} = \beta + \ln 2/T$. In this case, we end up with a simple
birth-death process, where the number $P$ of copies of our protein of interest
increases with constant rate $\alpha$ and decreases with rate $\beta_{\rm eff}P$, described by
the following master equation

\[ \frac{d\mathcal{P}(P,t)}{dt} = \alpha\left[\mathcal{P}(P-1,t) - \mathcal{P}(P,t)\right] + \beta_{\rm eff}\left[(P+1)\,\mathcal{P}(P+1,t) - P\,\mathcal{P}(P,t)\right], \tag{15} \]



where $\mathcal{P}(P,t)$ is the probability to have $P$ proteins at time $t$. The moments
of that distribution in the steady state, $\langle P^n\rangle$, can easily be calculated by
multiplying the master equation with powers of $P$ and summing over $P$. For
this type of model, the protein copy number does not exhibit the periodic
behavior seen in the models with explicit cell division, but rather fluctuates
around a constant mean value $\langle P\rangle = \alpha/\beta_{\rm eff}$ in the steady state (Figure 2(e)).
These fluctuations are characterized by $\eta^2 = 1/\langle P\rangle$, so the Fano factor is
the same as $F_0$ for the case with explicit cell division discussed before. This
indicates that models with implicit cell division (which by the choice of
$\beta_{\rm eff}$ are constructed to correctly describe the dynamics of the mean protein
number on time scales that are long compared to the generation time $T$) also
provide a good description of the fluctuations in such a system.
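
The birth-death process with effective degradation can be simulated with the Gillespie algorithm. The sketch below is a minimal implementation that assumes a stable protein, so that the effective degradation rate is ln 2 / T, and uses the synthesis rate and doubling time quoted in the caption of Fig. 2. It records time-weighted moments of the copy number. The simulated time span and the function name are illustrative choices.

```python
import math
import random


def birth_death_moments(alpha, beta_eff, t_max, seed=1):
    """Gillespie simulation of synthesis (rate alpha) and loss (rate beta_eff * P).

    Returns the time-averaged mean and variance of the copy number P.
    """
    rng = random.Random(seed)
    t = 0.0
    P = round(alpha / beta_eff)  # start near the steady-state mean
    weight = first = second = 0.0
    while True:
        total_rate = alpha + beta_eff * P
        dt = rng.expovariate(total_rate)
        last_step = t + dt >= t_max
        if last_step:
            dt = t_max - t       # truncate the final waiting time
        weight += dt
        first += P * dt
        second += P * P * dt
        if last_step:
            break
        t += dt
        if rng.random() * total_rate < alpha:
            P += 1               # synthesis event
        else:
            P -= 1               # loss through effective degradation
    mean = first / weight
    return mean, second / weight - mean ** 2


alpha = 0.5                   # proteins per minute, as in Fig. 2
T = 40.0                      # doubling time in minutes, as in Fig. 2
beta_eff = math.log(2.0) / T  # stable protein: dilution only
mean, variance = birth_death_moments(alpha, beta_eff, t_max=2.0e5)
fano = variance / mean
noise_strength = variance / mean ** 2
print(f"mean = {mean:.1f}, Fano factor = {fano:.2f}, noise strength = {noise_strength:.4f}")
```

Up to sampling error, the mean should be close to the ratio of synthesis rate and effective degradation rate, and the Fano factor close to one.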
Fig. 3 Burstiness of protein synthesis: (a)-(c) Trajectories of the protein copy
number from stochastic simulations with (a) a one-step model of protein synthesis,
(b) a two-step model (transcription and translation) with low burstiness, and (c)
a bursty two-step model. All three cases are for implicit cell division and exhibit
the same average protein copy number. (d) Noise strength $\eta^2$ for bursty protein
synthesis with exponential burst size distribution (as in the two-step models) or
with constant burst sizes as a function of the average protein copy number (varied
by varying $\alpha_m$). (e) Fano factor for models with either implicit or explicit cell
division as function of the burst size $b$. The parameter values are (a) $\alpha = 2$/min,
$\beta_{\rm eff} = 0.01$/min, (b) $\alpha_p = 0.4$/min, $\beta_p = 0.01$/min, $\alpha_m = 10$/min, $\beta_m = 2$/min,
(c) $\alpha_p = 10$/min, $\beta_p = 0.01$/min, $\alpha_m = 0.4$/min, $\beta_m = 2$/min, (d) $\beta_p = 0.01$/min,
and (e) $\beta_p = 0.01$/min, $\alpha_m = 2$/min, $\beta_m = 5$/min, $T = 60$ min.



4 Bursts of protein synthesis 



As discussed in section 2.4, the two-step nature of protein synthesis can often
be neglected as mRNA levels evolve on faster time scales than protein levels, 
and therefore the dynamics of mRNA can be approximated by its steady 
state. However, while absorbing the mRNA degrees of freedom into effective 
protein synthesis results in a correct description of the average protein level, 
it generally underestimates fluctuations, as it smoothens out the 'bursty' 
nature of protein synthesis resulting from the two-step process. This was 
realized first by Berg in 1978 [3] and has been studied extensively in recent 
years, as experimental techniques to count proteins in individual cells were 
developed 0|li|33]. 

To keep the discussion simple, we start with the stochastic version of
Eq. (10), i.e. with a model that describes cell division by an effective protein
degradation [45]. The mRNA part of Eq. (10) follows the same dynamics
as the protein in Eq. (15) and is thus characterized by the same noise
$\eta_M^2 = 1/\langle M\rangle$ with $\langle M\rangle = \alpha_m/\beta_m$. However, the protein number, $P$, behaves
differently and is characterized by $\langle P\rangle = \alpha_m\alpha_p/(\beta_m\beta_p)$ and $\eta_P^2 = (1 + b)/\langle P\rangle$
05], where $b = \alpha_p/(\beta_p + \beta_m) \approx \alpha_p/\beta_m$ is called the 'burst size' and describes
the average number of proteins synthesized per copy of the mRNA or the am- 
plification of transcription by translation. Experimentally determined burst 
sizes range between 1 and 10 [lOTH] . The increase in noise can be interpreted 
as an additional (independent) source of noise that arises from the stochas- 
tic amplification of the transcription output by translation. This additional 
noise is characterized by a noise strength $b/\langle P\rangle$ that is added to the noise
already present from stochastic protein synthesis and degradation/dilution 
in the absence of stochastic amplification. 

The bursty nature of these processes is most evident at low transcription
rates: protein synthesis events are then rare (as transcripts are produced
infrequently), but multiple copies of the protein are generated in every
synthesis event. The increase in fluctuations for the case of bursty synthesis
is illustrated in Fig. 3, where we plot trajectories for three cases with the
same average protein number. In Fig. 3(a), protein synthesis is described by a
single step with rate $\alpha = \alpha_m\alpha_p/\beta_m \approx \alpha_m b$;
in Fig. 3(b) and (c), protein synthesis is described as a two-step process.
However, while in Fig. 3(b) the transcription rate is large and the
translation rate is small ($b \approx 0.2$), the translation rate is large and
the transcription rate is small in Fig. 3(c),
resulting in bursty protein synthesis (with $b \approx 5$).
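
Trajectories of the kind described here can be generated with a standard Gillespie simulation. The sketch below is an illustration under assumed parameters, not the code used for Fig. 3: the mean protein number (50), the mRNA degradation rate (0.5 per min), the doubling time (60 min) and all function names are assumptions chosen for the example. Only the two burst sizes, $b \approx 0.2$ and $b \approx 5$, are taken from the text.

```python
import math
import random

def gillespie_two_step(alpha_m, beta_m, alpha_p, beta_p, t_end, seed=0):
    """Simulate the two-step model; return event times and protein numbers."""
    rng = random.Random(seed)
    t, m, p = 0.0, 0, 0
    times, proteins = [0.0], [0]
    while True:
        rates = (alpha_m, beta_m * m, alpha_p * m, beta_p * p)
        total = sum(rates)
        t += rng.expovariate(total)      # waiting time to the next reaction
        if t >= t_end:
            return times, proteins
        r = rng.random() * total         # choose a reaction proportional to its rate
        if r < rates[0]:
            m += 1                       # transcription
        elif r < rates[0] + rates[1]:
            m -= 1                       # mRNA degradation
        elif r < rates[0] + rates[1] + rates[2]:
            p += 1                       # translation
        else:
            p -= 1                       # effective protein degradation
        times.append(t)
        proteins.append(p)

def time_weighted_stats(times, values, t_end):
    """Time-averaged mean and variance of a piecewise-constant trajectory."""
    s1 = s2 = 0.0
    for i, v in enumerate(values):
        dt = (times[i + 1] if i + 1 < len(times) else t_end) - times[i]
        s1 += v * dt
        s2 += v * v * dt
    mean = s1 / t_end
    return mean, s2 / t_end - mean**2

def run_case(b, mean_p=50.0, beta_m=0.5, doubling_time=60.0, t_end=20000.0):
    """Return (mean, Fano factor) for burst size b at fixed average protein number."""
    beta_p = math.log(2.0) / doubling_time           # dilution only (stable protein)
    alpha_p = b * (beta_p + beta_m)                  # from b = alpha_p / (beta_p + beta_m)
    alpha_m = mean_p * beta_m * beta_p / alpha_p     # fixes <P> = alpha_m alpha_p / (beta_m beta_p)
    times, proteins = gillespie_two_step(alpha_m, beta_m, alpha_p, beta_p, t_end)
    mean, var = time_weighted_stats(times, proteins, t_end)
    return mean, var / mean

results = {b: run_case(b) for b in (0.2, 5.0)}
for b, (mean, fano) in results.items():
    print(f"b = {b}: mean = {mean:.1f}, Fano factor = {fano:.2f} (expected about {1 + b})")
```

Both cases are tuned to the same average protein number, so any difference in the width of the fluctuations reflects the burst size alone. The estimates are statistical and depend on the run length and the seed.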

It is worth mentioning here that the bursts on the one hand amplify the noise
from transcription, but on the other hand also create additional noise, as the
size of each burst is a stochastic quantity. To disentangle these two effects,
we determine the noise strength $\eta^2$ for a one-step model of protein
synthesis in which, however, $b$ copies of the protein are produced in every
synthesis event. In that case, the burst size does not fluctuate, but bursts
can still amplify the noise from the one-step synthesis process that mimics
transcription. This case can be solved using a modified version of the master
equation (15) (see footnote 2) and leads to a noise strength
$\eta^2 = (1+b)/(2\langle P\rangle)$ with
$\langle P\rangle = b\,\alpha/\beta_{\rm eff}$. This is exactly half of what we
have obtained for exponentially distributed burst sizes in the two-step model
(see also Fig. 3(d), where we plot the noise strength for both constant and
exponentially distributed burst sizes; see footnote 3). This result indicates
that the two effects of bursting contribute equally to the
increased noise.
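
The factor of one half can be verified from the moment equations of the constant-burst model. The following is a sketch under the assumptions stated above: synthesis events occur at rate $\alpha$ and add exactly $b$ proteins, and each protein is removed at the effective rate $\beta_{\rm eff}$. The Heaviside factor of footnote 2 only shifts the summation index and does not affect the moments.

```latex
\begin{align}
\frac{d\langle P\rangle}{dt} &= b\,\alpha - \beta_{\rm eff}\langle P\rangle, \\
\frac{d\langle P^2\rangle}{dt} &= \alpha\left(2b\langle P\rangle + b^2\right)
   + \beta_{\rm eff}\left(\langle P\rangle - 2\langle P^2\rangle\right).
\end{align}
% Steady state: set both time derivatives to zero and use alpha/beta_eff = <P>/b
\begin{align}
\langle P\rangle &= \frac{b\,\alpha}{\beta_{\rm eff}}, \\
\langle P^2\rangle - \langle P\rangle^2 &= \frac{1+b}{2}\,\langle P\rangle, \\
\eta^2 = \frac{\langle P^2\rangle - \langle P\rangle^2}{\langle P\rangle^2}
   &= \frac{1+b}{2\langle P\rangle}.
\end{align}
```

For $b = 1$ this reduces to the Poisson result $\eta^2 = 1/\langle P\rangle$ of the single-step model, as expected.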

The model discussed so far describes cell division implicitly as an effective
protein degradation, but models with explicit stochastic cell division exhibit
the same burstiness behavior. This is shown in Fig. 3(e), where we plot the
Fano factor $F_{P,0} = \eta_P^2\langle P_0\rangle$ for a model where both
protein and mRNA are divided stochastically between daughter cells as in
section 3.3. $F_{P,0}$ shows the same dependence on the burst size except for
a different prefactor of the linear term ($\approx \ln 2$), which arises from
the fact that averages are taken over slightly different populations (over
cells immediately after division vs. over age-less cells representing an
average over the division cycle).

2 The first term is replaced by $\alpha\,\mathcal{P}(P-b,t)\,\Theta(P-b)$,
where $\Theta$ is the Heaviside function with $\Theta(P-b) = 1$ for
$P \geq b$ and $\Theta(P-b) = 0$ for $P < b$.

3 Note that for constant burst sizes, the values of $b$ must be integers and
that the result for a single-step protein synthesis is recovered for $b = 1$,
where every transcription event leads to the synthesis of exactly one protein
molecule. With stochastic burst sizes, however, $b$ can have non-integer
values and the single-step process is recovered by taking the limit
$b \to 0$, while keeping $b\,\alpha_m$ constant.

Finally, we want to mention that burstiness can also arise from physical
processes other than multiple translations of a transcript. For example,
bursts have been demonstrated experimentally to occur on the level of
transcription [17], which can be interpreted as resulting from the stochastic
switching of the gene between two activity states (transcription 'on' or
'off'). The molecular origin of these activity states, however, remains
unclear (see footnote 4), although several mechanisms have been proposed
(e.g. states of chromosome structures, binding/unbinding of transcription
factors, etc. [31, 46]). In a genome-wide study, the Fano factors for mRNA
were found to range mostly between 1 and 2, larger than what is expected for a
single-step (Poisson) synthesis, but not much larger [44].



5 Extrinsic noise 

So far, we have discussed intrinsic noise in gene expression, i.e. noise that
is specific to a particular gene or protein and results from the inherent
stochasticity of the synthesis and degradation of that protein. As we have
seen, a characteristic property of intrinsic noise is its scaling proportional
to the inverse of the average protein number in the cell. We now turn to
extrinsic noise, i.e. fluctuations of cellular parameters that affect all
genes/proteins in a cell. Such noise was first demonstrated by a study of the
correlations between the reporter proteins expressed from two copies of the
same operon [15]. For highly abundant proteins, intrinsic noise becomes
negligible and the extrinsic component of the noise, which does not depend on
protein abundance, is dominant, with fluctuations of about 30% in the protein
concentration, as shown by a study of a library of fluorescent reporter
proteins [44]. There are many possible sources of extrinsic noise, such as
fluctuations in the concentrations of essential components of the
transcription and translation machinery or of mRNA degradation enzymes (RNA
polymerases, ribosomes, RNases). Here we consider two effects that should be
present even if such fluctuations are suppressed by feedback mechanisms for
the synthesis of these machines: cell-to-cell variations arising from
different cell ages in a population (section 5.1) and effects due to
fluctuations in the growth rate (section 5.2).



We note that another definition of extrinsic and intrinsic noise has been
given in ref. [34]. There, the distinction between extrinsic and intrinsic
noise is not based on distinguishing a specific genetic system and its
environment, which affects different genes in the same way, but on the
dependence of the noise on the average protein number. One component of the
noise exhibits the characteristic $1/\langle P\rangle$ behavior and is
classified as intrinsic, while the component of the noise that does not
exhibit this behavior and depends on the fluctuation of a variable that
influences the protein synthesis rate is classified as extrinsic. The two
cases we consider here are extrinsic according to both definitions, but based
on the definition of ref. [31], one could, for example, consider the noise
from transcription as extrinsic to translation.

4 In eukaryotic systems, they are believed to mostly reflect different states
of the chromatin structure.



5.1 Effects of the division cycle 

In section 2, we have seen that the protein concentration varies
systematically over the course of a division cycle. In a population of
non-synchronized cells, this age-dependence of the protein content is observed
as a cell-to-cell variation that forms part of the extrinsic noise. To study
the effect of age-dependent protein content and to estimate what part of the
extrinsic noise can be understood from such deterministic variation, we now
determine the distributions of the protein number and concentration over the
division cycle. As for the average protein number calculated in section 2, we
have to take the age distribution of the experimental culture into account. We
consider again the case of a single lineage and of an exponentially growing
population, i.e. a constant and an exponential age distribution as given in
Eq. (5). We denote the resulting protein copy number distributions by
$\Phi(P)$ and $\Psi(P)$, respectively. They can be calculated by inverting the
time dependence of $P(t)$ and using the inverse relation $t(P)$ for a
transformation of variables in the age distribution, see appendix A.3.

The distributions for both types of cell culture are presented in Fig. 4.
Panel (a) shows the distributions of protein number and concentration for a
single lineage, $\Phi(P)$ and $\Phi(p)$, respectively. The concentration
distribution was determined for both linear and exponential volume growth. The
distribution of the protein copy number, $\Phi(P)$ (top panel), exhibits two
flat plateaus. The probability to find a protein number $P < P(t = t_x)$ that
is seen prior to the replication time $t_x$ is twice as high as for a protein
number that corresponds to larger times, $P(t > t_x)$, as the synthesis rate
doubles at time $t_x$.

For the concentration subject to linear volume growth (middle panel),
$\Phi(p_{\rm lin}(t))$ is almost flat with a minimum for intermediate
concentrations. In the case of exponential volume growth (bottom panel),
$\Phi(p_{\rm exp}(t))$, which is quite flat for small concentrations, rises
sharply towards the maximum concentration.

It is worth noting that while the protein copy number exhibits a broad
distribution over a two-fold range, defined by the copy numbers directly
before and after cell division, the range over which the concentration varies
is much smaller: The maximal concentration is only $\approx 13\%$ larger than
the minimal concentration for linear volume growth and even less
($\approx 6\%$) for exponential volume growth.
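
The size of these ranges can be made plausible with a short worked estimate. It is a sketch that assumes the deterministic model of section 2, which is not reproduced here, in the following form: a stable protein synthesized at rate $\alpha$ per gene copy, gene duplication at $t_x$, exact halving at division, and volume growth $V_{\rm lin}(t) = V_0(1 + t/T)$ or $V_{\rm exp}(t) = V_0\,2^{t/T}$. The shorthand $x = t_x/T$ is introduced for this example.

```latex
% Periodicity P(T) = 2 P(0) with P(T) = P(0) + \alpha t_x + 2\alpha (T - t_x):
P(0) = \alpha\,(2T - t_x), \qquad P(t_x) = 2\alpha T .
% Linear volume growth: p_lin decreases on [0, t_x] and increases on [t_x, T],
% so the extreme values are p_lin(0) = p_lin(T) and p_lin(t_x):
\frac{p_{\rm lin}(0)}{p_{\rm lin}(t_x)}
   = \frac{(2 - x)(1 + x)}{2} \;\le\; \frac{9}{8} \quad (\text{at } x = 1/2).
% Exponential volume growth, same two time points:
\frac{p_{\rm exp}(0)}{p_{\rm exp}(t_x)}
   = \frac{(2 - x)\,2^{x}}{2} \;\approx\; 1.06 \quad (\text{at } x = 1/2).
```

Under these assumptions the concentration varies by at most 12.5% for linear growth and by about 6% for exponential growth, consistent with the values quoted above. For exponential growth the maximum of $p_{\rm exp}$ lies slightly before $t = T$, so the second ratio is a close lower bound on the full range.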

Figure 4(b) shows the corresponding results for an exponentially growing
culture (with an exponential age distribution). The distribution for the
protein number (top panel), $\Psi(P)$, still exhibits two plateaus, which are
tilted towards smaller values of $P$ as the age distribution gives more weight
to younger cells. The distributions of the concentrations (middle and lower
panel), $\Psi(p_{\rm lin})$ and $\Psi(p_{\rm exp})$, are not radically altered
by the change in age distribution.

Fig. 4 Distributions for protein number and concentrations as arising from the
deterministic variation over the division cycle. (a) Distributions for the
protein number $\Phi(P)$ and concentration $\Phi(p_{\rm lin}(t))$ and
$\Phi(p_{\rm exp}(t))$ (for the case of linear and exponential volume growth,
respectively) for a single lineage. (b) Distributions for the protein number
$\Psi(P)$ and concentrations $\Psi(p_{\rm lin})$ and $\Psi(p_{\rm exp})$ for
an exponentially growing cell population with age distribution $\phi(t)$. The
parameters are as in Fig. 1.



Next, we determine the noise parameter that characterizes the variation over
the division cycle, which in analogy to Eq. (11) can be defined as

$$\eta_P^2 = \frac{\delta P^2}{\langle P\rangle^2}
  = \frac{\langle P^2\rangle - \langle P\rangle^2}{\langle P\rangle^2} \qquad (16)$$

for the protein copy number and likewise,
$\eta_{p_{\rm lin}}^2 = \delta p_{\rm lin}^2/\langle p_{\rm lin}\rangle^2$ or
$\eta_{p_{\rm exp}}^2 = \delta p_{\rm exp}^2/\langle p_{\rm exp}\rangle^2$,
for the concentration (for linear and exponential volume growth,
respectively). One parameter that affects the extent of this deterministic
cell-to-cell variation is the replication time $t_x$, which depends on the
genomic location of the gene of interest relative to the origin of
replication. In Fig. 5, we show the noise parameters for the protein copy
number and the concentration as functions of $t_x$ in the range
$0 < t_x < T$.

Fig. 5 Noise parameter $\eta^2$ arising from deterministic variations over the
division cycle for (a) the protein number $P$ and (b) the protein
concentration, $\eta_{p_{\rm lin}}^2$ and $\eta_{p_{\rm exp}}^2$ (with linear
and exponential volume growth), as functions of the replication time $t_x$.

We first note that these noise parameters only depend on $T$ and $t_x$ or,
more precisely, on their ratio $t_x/T$. Specifically, they are independent of
the protein synthesis rate $\alpha$ and, in the case of the concentrations, of
the initial cell volume $V_0$. Therefore, this contribution to the observed
noise does not decrease with increasing protein concentration and does not
become negligible for abundant proteins. However, the overall contribution of
the division cycle to the noise is relatively small. For the protein
concentration, the noise $\eta_{p_{\rm lin}}^2$ in the case of linear growth
is on the order of 0.001 (solid line in Fig. 5(b)), and for exponential growth
$\eta_{p_{\rm exp}}^2 < 5 \times 10^{-4}$ for all values of $t_x$. These
values correspond to 2-3 % variation of the concentration and are considerably
smaller than the observed extrinsic noise, which is of the order of
$\eta^2 \sim 0.1$ [44]. For the protein copy number, which varies over a wider
range, the effect of the division cycle is more pronounced and varies by about
one fourth of its mean over different values of $t_x$. However, even here, the
absolute value of $\eta^2$, being on the order of 0.04, remains rather small.
We can thus conclude that, while the division cycle contributes to the
observed extrinsic noise, other sources
of extrinsic noise are more dominant.
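
The orders of magnitude quoted here can be reproduced with a short numerical sketch. It assumes the deterministic model of section 2 in the following form, which is not shown in this section: a stable protein, synthesis rate $\alpha$ per gene copy, gene duplication at $t_x$, halving at division, a constant age distribution for a single lineage and an age distribution proportional to $2^{-t/T}$ for an exponentially growing population. The function name `cycle_noise` and the choice of units ($T = \alpha = V_0 = 1$) are illustrative.

```python
import numpy as np

def cycle_noise(x, growth="linear", age="uniform", n=200_000):
    """Return eta^2 of (protein number, protein concentration) over the division cycle.

    x = t_x / T is the relative replication time; units T = alpha = V_0 = 1,
    which is sufficient because eta^2 depends only on the ratio t_x / T.
    """
    t = (np.arange(n) + 0.5) / n                        # midpoint grid of cell ages
    p0 = 2.0 - x                                        # P(0), from P(T) = 2 P(0)
    P = np.where(t < x, p0 + t, 2.0 + 2.0 * (t - x))    # synthesis rate doubles at t_x
    V = 1.0 + t if growth == "linear" else 2.0 ** t     # volume growth law
    w = np.ones(n) if age == "uniform" else 2.0 ** (-t) # age distribution (unnormalized)
    w = w / w.sum()

    def eta2(y):
        mean = np.sum(w * y)
        return np.sum(w * (y - mean) ** 2) / mean**2

    return eta2(P), eta2(P / V)

for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    n_P, n_lin = cycle_noise(x, growth="linear")
    _, n_exp = cycle_noise(x, growth="exponential")
    print(f"t_x/T = {x:4.2f}: eta2_P = {n_P:.4f}, "
          f"eta2_p_lin = {n_lin:.5f}, eta2_p_exp = {n_exp:.5f}")
```

Within this sketch, the copy-number noise for a single lineage is $1/27 \approx 0.037$ at $t_x = 0$ and stays near 0.04, while the concentration noise remains of order $10^{-3}$ or below. Passing `age="exponential"` gives the corresponding values for a growing population.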



5.2 Fluctuations of the growth rate 

The fluorescent reporter protein library study of Taniguchi et al. mentioned
above [44] showed that abundant proteins exhibit extrinsic noise that does not
display the inverse scaling with the mean protein concentration. The same
study also revealed some additional characteristics of that noise: In
particular, (i) there are correlations between the noise of different
proteins, a defining feature of extrinsic noise [15], and (ii) the extrinsic
fluctuations are slow, with variations in the protein concentration over
timescales longer than the generation time [44]. Moreover, they come together
with substantial fluctuations of the generation time. We therefore ask now
whether fluctuations in the growth rate may substantially contribute to the
observed extrinsic noise.

For an estimate of the effect of a fluctuating growth rate, we make the
assumption that while the doubling time fluctuates slowly, the protein
synthesis rate per cell volume, $\alpha/V$, remains approximately constant.
This condition is (approximately) satisfied by the population average of the
synthesis rate as a function of growth rate when the growth rate is
systematically varied by using different growth media [25]. It basically means
that changes of the growth conditions, while affecting the synthesis rate of
protein numbers, do not affect the rate of synthesis of protein concentration.
Only the effective degradation is changed when the growth rate changes. Under
balanced growth, this constancy is the result of the combination of several
factors (such as the availability of RNA polymerases and ribosomes, the gene
copy number etc. [25, 23]) that do change, but in such a way that their
combined effect cancels out (with the exception of conditions of very slow
growth) [25]. Obviously, it is not clear that this assumption holds for slowly
varying growth rates in individual cells; in principle, all factors that
contribute to the growth-rate dependence of protein concentrations could vary
in a mutually independent fashion, but we can consider the case where they
vary together as one that provides a lower limit for the resulting noise. With
a deterministic description of protein synthesis, balancing synthesis against
dilution at rate $\ln 2/T$, we obtain $p = (\alpha/V)\,T/\ln 2$, so
fluctuations of $T$ are directly carried over into fluctuations of the protein
concentration $p$. If $T$ fluctuates by some time $\Delta T$ (of about
10-25 % of the doubling time), $p$ will fluctuate by
$\Delta p = \alpha\,\Delta T/(V \ln 2)$, or also about 10-25 %, as
$\Delta p/p = \Delta T/T$. This would correspond to a noise parameter $\eta^2$
of 0.01-0.08. While this simple estimate is certainly not an accurate
description of such global noise, it clearly indicates that fluctuations in
the growth rate can lead to noise in
protein concentrations of the order of the observed extrinsic noise [44].
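
The estimate can be illustrated with a few lines of code. This is a sketch under the assumptions of the paragraph above: $\alpha/V$ is constant, and fluctuations of $T$ are slow enough that $p = (\alpha/V)\,T/\ln 2$ holds for each cell. The gamma distribution for the doubling time is an assumption made only to keep $T$ positive; the text does not specify a distribution.

```python
import numpy as np

def concentration_noise(cv_T, n=200_000, seed=0):
    """eta^2 of p = (alpha/V) * T / ln 2 for a doubling time T with relative spread cv_T."""
    rng = np.random.default_rng(seed)
    # Doubling time in units of its mean; gamma distribution with mean 1 and
    # coefficient of variation cv_T (assumed form of the fluctuations)
    T = rng.gamma(shape=1.0 / cv_T**2, scale=cv_T**2, size=n)
    p = T / np.log(2.0)              # alpha/V set to 1, it cancels in eta^2
    return p.var() / p.mean() ** 2

for cv in (0.10, 0.25):
    print(f"Delta T / T = {cv:.2f}: eta^2 = {concentration_noise(cv):.4f}")
```

Since $p$ is proportional to $T$, the noise parameter is simply $(\Delta T/T)^2$, i.e. about 0.01 for 10 % and about 0.06 for 25 % fluctuations of the doubling time, of the same order as the range quoted above.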



6 Concluding remarks 

In this article, we have discussed several ways of describing gene expression 
with deterministic or stochastic models. Deterministic models that explic- 
itly describe cell division, gene duplication, and volume growth provide a 
detailed description of the dynamics over both short and long time scales 
(compared to the doubling time). We have shown that the results depend 
generally on specific details of the model such as how volume growth is im- 
plemented and the age structure of the population over which averages are 
taken. Fortunately, however, these differences are not dramatic. Moreover, a 
mean-field-like approximation that describes protein synthesis by an effec- 
tive rate per volume given by the average gene copy number and the average 
cell volume provides a good approximation that averages over the detailed 
dynamics within the division cycle. Nevertheless, it is worth keeping these
subtle effects in mind and carefully distinguishing different normalizations
of protein amounts or synthesis rates, such as per gene (e.g. $\alpha$), per
cell ($\alpha\langle g\rangle$) and per volume
($\alpha\langle g\rangle/\langle V\rangle$). This is particularly important
for studies that address the coupling of gene expression and global cellular
physiology, where quantities such as the average gene copy number and the
average volume per cell may change [25].

With respect to the fluctuations around this average behavior, we have
compared several simple models to disentangle the contributions of different
sources of noise. This comparison shows that the noise contributions from
sources such as stochastic protein synthesis or degradation and stochastic
partitioning during cell division are all of the same order and that there is
no single dominant noise source, except when protein synthesis is pronouncedly
bursty. The burstiness of protein synthesis is the largest contribution to the
noise (with a Fano factor $\sim b$, while the other noise sources have Fano
factors of fractions of 1). If $b$ is large, this contribution is clearly
dominant, and one could neglect all other sources of noise. The study of
Taniguchi et al. [44], however, indicates that typical values of $b$ for many
low-abundance proteins are in the range 1-10, so that burstiness is not
necessarily dominant. In many cases, a realistic description of the dynamics
of expression of low-abundance proteins will therefore need to include all
these sources of noise.

For intermediate-abundance to high-abundance proteins (with
$\langle P\rangle > 20$), the noise is dominated by extrinsic noise [44]. Here
we have considered two sources of extrinsic noise: We have shown that the
deterministic contribution from systematic variation over the division cycle
is rather small (even for the protein copy number, but in particular for the
concentration), while fluctuations in the growth rate can be expected to give
a larger contribution. These results suggest that a model that incorporates
the burstiness of protein synthesis and fluctuations in the growth rate might
provide a minimal description of stochastic effects in gene expression that is
able to describe both intrinsic and extrinsic components of the noise.

A Appendix

A.1 Typical values of the parameters

Estimates of typical parameter values in the model organism E. coli are
summarized in Table 1. Most of these can, for example, be estimated from the
data of ref. [44]. A few of them require additional comments: (i) In E. coli,
proteins are typically stable, i.e. $\beta_p \approx 0$. So far, no complete
survey of protein stability has been made, but the total cellular protein mass
was found to be stable [32] and early proteomics studies (2d-gels) also
indicated that almost all proteins covered by their approach were stable [38].
Nevertheless, some proteins are known to be unstable and, in these cases,
$\beta_p$ can be of the order of 1 min$^{-1}$. (ii) Genes are typically
present as a single copy in the genome. This means that the gene copy number
per cell is 1 before the gene is replicated and 2 after replication. Average
gene copy numbers are between 1 and 2, except at fast growth with doubling
times $T < 60$ min, where rounds of DNA replication overlap and the gene copy
numbers can be larger [12, 17]. (iii) The cell volume doubles over the
division cycle and its average value depends on the growth conditions [7]. The
value given in the table should be taken as an order of magnitude estimate.

A.2 Models with stochastic protein synthesis and stochastic division

A general method for solving processes involving different rules of protein
synthesis and cell division has been described in ref. [8]. In most cases,
this method allows us to find the average and the standard deviation of the
protein number. We describe the method briefly here, following [8]. Let $P_n$
be the protein content in the $n$-th generation immediately after cell
division. Let $\Lambda_n$ be the amount of protein produced and accumulated
until the cell division time in generation $n$, and let $q_n$ be the fraction
of protein inherited by the daughter cell at cell division. Then one can write

$$P_{n+1} = q_n\,(P_n + \Lambda_n). \qquad (17)$$

The protein production as well as the division fraction can be drawn from
some distributions. If these distributions admit finite moments, then in the
steady state the distributions of $\Lambda$ and $q$ become independent and
hence one can write

$$\langle P^k\rangle = \langle q^k\rangle\,\langle (P + \Lambda)^k\rangle. \qquad (18)$$

From here one can get all the moments of $P$, in particular
$\langle P\rangle = \langle\Lambda\rangle$ for symmetric division
($\langle q\rangle = 1/2$). Let us consider an example where protein is added
with rate $\alpha$ between two cell divisions and where the protein number is
divided deterministically into halves at every cell division, which occurs
after every time $T$. In this case the synthesis of protein follows a Poisson
distribution, giving
$\langle\Lambda\rangle = \delta\Lambda^2 = \alpha T$, and the division
fraction is given by a delta function $\delta(q - 1/2)$ with
$\langle q\rangle = 1/2$ and
$\langle q^2\rangle - \langle q\rangle^2 = 0$. Thus Eq. (18) gives
$\langle P\rangle = \langle\Lambda\rangle$ and
$\langle P^2\rangle = \frac{1}{3}\left(2\langle\Lambda\rangle^2 + \langle\Lambda^2\rangle\right)$.
After some algebra one finds

$$\eta^2 = \frac{\langle P^2\rangle - \langle P\rangle^2}{\langle P\rangle^2}
  = \frac{1}{3\langle P\rangle},$$

which is one of the cases discussed in the main text.
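
The algebra of this example can be checked with exact rational arithmetic. The sketch below solves Eq. (18) for $k = 1$ and $k = 2$ under the additional assumption that $P$ and $\Lambda$ are uncorrelated, so that $\langle P\Lambda\rangle = \langle P\rangle\langle\Lambda\rangle$, as is the case for synthesis at a constant rate. The function name `steady_state_noise` and the value $\alpha T = 30$ are illustrative.

```python
from fractions import Fraction

def steady_state_noise(q1, q2, lam1, lam2):
    """Mean and eta^2 of P from Eq. (18), given <q>, <q^2>, <Lambda>, <Lambda^2>."""
    # k = 1: <P> = <q> (<P> + <Lambda>)
    mean_p = q1 * lam1 / (1 - q1)
    # k = 2: <P^2> = <q^2> (<P^2> + 2 <P><Lambda> + <Lambda^2>)
    second = q2 * (2 * mean_p * lam1 + lam2) / (1 - q2)
    return mean_p, (second - mean_p**2) / mean_p**2

aT = Fraction(30)                       # alpha * T, assumed example value
lam1, lam2 = aT, aT + aT**2             # Poisson: variance equals the mean
q1, q2 = Fraction(1, 2), Fraction(1, 4) # deterministic halving
mean_p, eta2 = steady_state_noise(q1, q2, lam1, lam2)
print(mean_p, eta2, 1 / (3 * aT))
```

For these values the mean is 30 and the noise parameter is $1/90 = 1/(3\alpha T)$, in agreement with the result above. Other division rules can be explored by changing the moments of $q$, as long as $q$ remains independent of $P$.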



Table 1 Typical parameter values for E. coli cells

parameter                    symbol                  typical range          comments
transcription rate           $\alpha_m$              0.1-10 min$^{-1}$
mRNA degradation rate        $\beta_m$               0.2-2 min$^{-1}$
translation rate             $\alpha_p$              1-10 min$^{-1}$
protein degradation rate     $\beta_p$               $\approx 0$            see text
gene copy number             $g$                     1-2                    see text
division time                $T$                     20 min - hours
average cell volume          $\langle V_0\rangle$    $\approx 1\ \mu$m$^3$  see text
effective synthesis rate     $\alpha$                0.1-500 min$^{-1}$     $= \alpha_m\alpha_p g/\beta_m$
burstiness                   $b$                     1-50                   $\approx \alpha_p/\beta_m$
effective degradation rate   $\beta_{\rm eff}$       $\approx 0.01$ min$^{-1}$   $= \beta_p + \ln 2/T$

A.3 Distribution of protein number and concentration due to variation over
the division cycle

The distribution of the protein number discussed in section 5.1 is obtained by
inverting the time dependence of the protein copy number, $P(t)$, to obtain
$t(P)$, and by a transformation of variables in the age distribution from $t$
to $P$, which leads to

$$\Psi(P) = \left(\frac{d}{dP}\,t(P)\right)\phi(t(P)). \qquad (19)$$

Specifically, for the constant age distribution that describes averages over a
single lineage, this leads to $\Phi(P) = \frac{d}{dP}\,t(P)$ (up to the
constant normalization of the age distribution). As a consequence, the result
for an arbitrary age distribution can be rewritten as

$$\Psi(P) = \Phi(P)\,\phi(t(P)), \qquad (20)$$

i.e., the distribution of protein number in a single lineage weighted with the
age distribution evaluated at the corresponding inverse, $t(P)$.

The distributions for the concentrations are obtained in an analogous fashion,
but the calculation is technically more involved as the concentration is not a
monotonic function of time (see, e.g., Fig. 1). We thus split the functions
$p_{\rm lin}(t)$ and $p_{\rm exp}(t)$ into piecewise monotonic functions and
determine the distributions for these separately. The concentration for linear
cell growth, $p_{\rm lin}(t)$, is monotonic in the intervals $[0, t_x]$ and
$[t_x, T]$, and for $p_{\rm exp}(t)$ we have three intervals, $[0, t_x]$,
$[t_x, t_{\max}]$ and $[t_{\max}, T]$, where $t_{\max}$ is the time at which
$p_{\rm exp}(t)$ is maximal. The complete distributions
$\Phi(p_{\rm lin}(t))$ and $\Phi(p_{\rm exp}(t))$ are then obtained by adding
up the distributions from the respective intervals. The distributions for the
concentrations, $\Psi(p)$, are again obtained for the corresponding intervals,
weighted with the age distribution
and summed up to yield the full distribution.
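
The change of variables of Eqs. (19) and (20) can be sketched for the protein number as follows. The sketch assumes the deterministic model of section 2 in a form not reproduced here: a stable protein, synthesis rate $\alpha$ per gene copy, gene duplication at $t_x$ and halving at division. It works in units $T = \alpha = 1$ and uses properly normalized age distributions, namely $\phi = 1$ for a single lineage and $\phi(t) = 2\ln 2 \cdot 2^{-t}$ as the assumed form for an exponentially growing population. The function name `protein_number_density` is illustrative.

```python
import numpy as np

def protein_number_density(P, x, age="uniform"):
    """Density of the protein number over the division cycle, (dt/dP) * phi(t(P)).

    x = t_x / T; in units T = alpha = 1 the protein number runs from 2 - x to 4 - 2x.
    """
    P = np.asarray(P, dtype=float)
    p0 = 2.0 - x                                # P(0), from P(T) = 2 P(0)
    before = P < 2.0                            # P(t_x) = 2 separates the two branches
    t = np.where(before, P - p0, x + (P - 2.0) / 2.0)   # inverse function t(P)
    dt_dP = np.where(before, 1.0, 0.5)          # slope halves once synthesis doubles
    if age == "uniform":
        phi = np.ones_like(t)                   # single lineage
    else:
        phi = 2.0 * np.log(2.0) * 2.0 ** (-t)   # exponentially growing population
    return dt_dP * phi

x = 0.4                                         # assumed example value of t_x / T
n = 100_000
edges = np.linspace(2.0 - x, 4.0 - 2.0 * x, n + 1)
mid = 0.5 * (edges[:-1] + edges[1:])            # midpoints of the protein-number grid
dP = edges[1] - edges[0]
norm_lineage = np.sum(protein_number_density(mid, x)) * dP
norm_population = np.sum(protein_number_density(mid, x, age="exponential")) * dP
print(norm_lineage, norm_population)            # both should be close to 1
```

For the constant age distribution, the density takes the values 1 and 1/2 below and above $P(t_x)$, which are the two plateaus described in section 5.1. The exponential age distribution tilts both plateaus towards smaller protein numbers.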
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